Statistical Machine Translation Decoding Using Target Word Reordering

Authors

  • Jesús Tomás
  • Francisco Casacuberta
Abstract

In the field of pattern recognition, the design of an efficient decoding algorithm is critical for statistical machine translation. The most common statistical machine translation decoding algorithms use the concept of a partial hypothesis. Typically, a partial hypothesis consists of a subset of source positions, which indicates the words that have already been translated in this hypothesis, and a prefix of the target sentence. Thus, the target sentence is generated from left to right, while the source words are taken in an arbitrary order. We present a new approach in which the source sentence is translated from left to right and any word reordering is performed on the target prefix. We implemented this approach using a multi-stack decoding technique for a phrase-based model and compared it with both a conventional approach and a monotone approach. Our experiments show that the new approach can significantly reduce search time without increasing search errors.
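
The two kinds of partial hypothesis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `extend` methods, and the `insert_at` parameter are hypothetical, and modelling target-side reordering as insertion into the target prefix is an assumption made only for illustration. Scoring, pruning, and the multi-stack search itself are omitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class ConventionalHypothesis:
    """Coverage set of source positions plus a target prefix (hypothetical names)."""
    covered: FrozenSet[int] = frozenset()
    target: Tuple[str, ...] = ()

    def extend(self, start: int, end: int, phrase: Sequence[str]) -> "ConventionalHypothesis":
        # Translate source span [start, end) in any order; append to the target prefix.
        span = frozenset(range(start, end))
        if span & self.covered:
            raise ValueError("source span overlaps already translated positions")
        return ConventionalHypothesis(self.covered | span, self.target + tuple(phrase))


@dataclass(frozen=True)
class TargetReorderingHypothesis:
    """Source consumed strictly left to right; reordering happens in the target prefix.

    Assumption: reordering is modelled as inserting the new phrase at an
    arbitrary position of the target prefix.
    """
    consumed: int = 0
    target: Tuple[str, ...] = ()

    def extend(self, length: int, phrase: Sequence[str], insert_at: int) -> "TargetReorderingHypothesis":
        # Consume the next `length` source words and place their translation at `insert_at`.
        if not 0 <= insert_at <= len(self.target):
            raise ValueError("insert position outside the target prefix")
        new_target = self.target[:insert_at] + tuple(phrase) + self.target[insert_at:]
        return TargetReorderingHypothesis(self.consumed + length, new_target)


# Toy example (invented data): source "la casa verde", target "the green house".
conv = ConventionalHypothesis()
conv = conv.extend(0, 1, ["the"])    # "la"
conv = conv.extend(2, 3, ["green"])  # "verde", taken out of source order
conv = conv.extend(1, 2, ["house"])  # "casa"

new = TargetReorderingHypothesis()
new = new.extend(1, ["the"], insert_at=0)    # "la"
new = new.extend(1, ["house"], insert_at=1)  # "casa"
new = new.extend(1, ["green"], insert_at=1)  # "verde", reordered in the target prefix
```

In this sketch the conventional hypothesis must track a subset of source positions, whereas the target-reordering hypothesis only needs a single counter for the source side. Whether this matches the state representation used in the paper is not stated in the abstract.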

Similar resources

Word-reordering for Statistical Machine Translation Using Trigram Language Model

In this paper we study the word-reordering problem in the decoding part of statistical machine translation, but independently from the target language generating process. In this model, a permuted sentence is given and the goal is to recover the correct order. We introduce a greedy algorithm called Local-(k, l)-Step, and show that it performs better than the DP-based algorithm. Our word-reorder...

Word Reordering in Statistical Machine Translation with a POS-Based Distortion Model

In this paper we describe a word reordering strategy for statistical machine translation that reorders the source side based on Part of Speech (POS) information. Reordering rules are learned from the word aligned corpus. Reordering is integrated into the decoding process by constructing a lattice, which contains all word reorderings according to the reordering rules. Probabilities are assigned ...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Coupling Hierarchical Word Reordering and Decoding in Phrase-Based Statistical Machine Translation

In this paper, we start with the existing idea of taking reordering rules automatically derived from syntactic representations, and applying them in a preprocessing step before translation to make the source sentence structurally more like the target; and we propose a new approach to hierarchically extracting these rules. We evaluate this, combined with a lattice-based decoding, and show improv...

A Source-side Decoding Sequence Model for Statistical Machine Translation

We propose a source-side decoding sequence language model for phrase-based statistical machine translation. This model is a reordering model in the sense that it helps the decoder find the correct decoding sequence. The model uses word-aligned bilingual training data. We show improved translation quality of up to 1.34% BLEU and 0.54% TER using this model compared to three other widely used reor...

Publication date: 2004